1,863 research outputs found

    An Empirical Analysis of Income Dynamics among Men in the PSID: 1968–1989

    This study uses data from the Panel Study of Income Dynamics (PSID) to address a number of questions about life-cycle earnings mobility. It develops a dynamic reduced-form model of earnings and marital status that is nonstationary over the life cycle. A Gibbs sampling-data augmentation algorithm facilitates use of the entire sample and provides numerical approximations to the exact posterior distribution of properties of earnings paths. This algorithm copes with the complex distribution of endogenous variables that are observed for short segments of an individual's work history, not including the initial period. The study reaches several firm conclusions about life-cycle earnings mobility. Incorporating non-Gaussian shocks makes it possible to account for transitions between low and higher earnings states, a heretofore unresolved problem. The non-Gaussian distribution substantially increases the lifetime return to postsecondary education, and substantially reduces differences in lifetime wages attributable to race. In a given year, the majority of the variance in earnings not accounted for by race, education, and age is due to transitory shocks, but over a lifetime the majority is due to unobserved individual heterogeneity. Consequently, low earnings at early ages are strong predictors of low earnings later in life, even conditioning on observed individual characteristics.
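    As a rough illustration of the data-augmentation idea, the sketch below alternates between imputing an unobserved stretch of a series from its Gaussian full conditional and drawing the autoregressive parameter. It is a hypothetical toy AR(1), not the paper's nonstationary, non-Gaussian earnings and marital-status model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setting: a single AR(1) "earnings" path with an unobserved stretch,
# a hypothetical stand-in for the paper's much richer model.
T, rho_true, sigma = 40, 0.8, 1.0
y = np.zeros(T)
for t in range(1, T):
    y[t] = rho_true * y[t - 1] + sigma * rng.standard_normal()
missing = np.arange(15, 20)
y_aug = y.copy()
y_aug[missing] = 0.0                   # pretend these periods were never observed

rho, rho_draws = 0.0, []
for it in range(2000):
    # Data-augmentation step: impute each missing y_t from its full
    # conditional p(y_t | y_{t-1}, y_{t+1}, rho), which is Gaussian.
    for t in missing:
        mean = rho * (y_aug[t - 1] + y_aug[t + 1]) / (1 + rho ** 2)
        var = sigma ** 2 / (1 + rho ** 2)
        y_aug[t] = mean + np.sqrt(var) * rng.standard_normal()
    # Parameter step: draw rho from its Gaussian conditional posterior
    # under a flat prior (sigma treated as known for brevity).
    x, z = y_aug[:-1], y_aug[1:]
    prec = (x @ x) / sigma ** 2
    rho = (x @ z) / (x @ x) + rng.standard_normal() / np.sqrt(prec)
    rho_draws.append(rho)

print("posterior mean of rho:", np.mean(rho_draws[500:]))
```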

    Sequentially Adaptive Bayesian Learning for a Nonlinear Model of the Secular and Cyclical Behavior of US Real GDP

    There is a one-to-one mapping between the conventional time series parameters of a third-order autoregression and the more interpretable parameters of secular half-life, cyclical half-life and cycle period. The latter parameterization is better suited to interpretation of results using both Bayesian and maximum likelihood methods, and to expression of a substantive prior distribution using Bayesian methods. The paper demonstrates how to approach both problems using the sequentially adaptive Bayesian learning (SABL) algorithm and software, which eliminates virtually all of the substantial technical overhead required in conventional approaches and produces results quickly and reliably. The work utilizes methodological innovations in SABL, including optimization of irregular and multimodal functions and production of the conventional maximum likelihood asymptotic variance matrix as a by-product.
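    The mapping itself can be computed from the roots of the AR(3) characteristic polynomial. The sketch below is an assumed implementation, taking half-lives from the log moduli of the roots and the period from the angle of the complex pair; the paper's exact conventions may differ.

```python
import numpy as np

def ar3_to_interpretable(phi1, phi2, phi3):
    """Map AR(3) lag coefficients to (secular half-life, cyclical half-life,
    cycle period). Assumes the characteristic roots are one positive real
    root plus a complex-conjugate pair, all inside the unit circle."""
    # Roots of z^3 - phi1*z^2 - phi2*z - phi3 = 0
    roots = np.roots([1.0, -phi1, -phi2, -phi3])
    real = [z for z in roots if abs(z.imag) < 1e-10]
    cplx = [z for z in roots if z.imag > 1e-10]
    r = real[0].real                              # secular (real) root
    m, theta = abs(cplx[0]), np.angle(cplx[0])    # cyclical root: modulus, angle
    secular_half_life = np.log(0.5) / np.log(r)
    cyclical_half_life = np.log(0.5) / np.log(m)
    cycle_period = 2 * np.pi / theta
    return secular_half_life, cyclical_half_life, cycle_period

# Example: a persistent trend plus a damped cycle of roughly ten periods.
print(ar3_to_interpretable(1.6, -0.9, 0.2))
```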

    Bayesian analysis of endogenous delay threshold models

    We develop Bayesian methods of analysis for a new class of threshold autoregressive models in which the delay is endogenous. We apply our methods to the commonly used sunspot data set and find strong evidence in favor of the endogenous delay threshold autoregressive (EDTAR) model over linear and traditional threshold autoregressions.
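    For readers unfamiliar with threshold autoregressions, the minimal simulation below shows a standard two-regime model with a fixed delay d; it is only a baseline illustration, since the abstract's point is precisely that EDTAR makes the delay endogenous rather than fixed.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two-regime threshold AR(1): the regime at time t is decided by the
# level of the series d periods earlier (fixed delay, for illustration).
T, d, threshold = 300, 2, 0.0
phi_low, phi_high = 0.9, 0.3          # AR coefficient in each regime
y = np.zeros(T)
for t in range(d, T):
    phi = phi_low if y[t - d] <= threshold else phi_high
    y[t] = phi * y[t - 1] + rng.standard_normal()

print("share of observations in the high regime:", np.mean(y[:-d] > threshold))
```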

    Mining Frequent Graph Patterns with Differential Privacy

    Discovering frequent graph patterns in a graph database offers valuable information in a variety of applications. However, if the graph dataset contains sensitive data of individuals, such as mobile phone-call graphs and web-click graphs, releasing discovered frequent patterns may present a threat to the privacy of individuals. Differential privacy has recently emerged as the de facto standard for private data analysis due to its provable privacy guarantee. In this paper we propose the first differentially private algorithm for mining frequent graph patterns. We first show that previous techniques for differentially private discovery of frequent itemsets cannot be applied to mining frequent graph patterns, due to the inherent complexity of handling structural information in graphs. We then address this challenge by proposing a Markov chain Monte Carlo (MCMC) sampling based algorithm. Unlike previous work on frequent itemset mining, our techniques do not rely on the output of a non-private mining algorithm. Instead, we observe that both frequent graph pattern mining and the guarantee of differential privacy can be unified into an MCMC sampling framework. In addition, we establish the privacy and utility guarantees of our algorithm and propose an efficient neighboring pattern counting technique. Experimental results show that the proposed algorithm is able to output frequent patterns with good precision.
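    The general idea of sampling-based private selection can be illustrated with the exponential mechanism realized as a Metropolis chain, sketched below over a hypothetical candidate set with support counts as the utility. The paper's actual proposal distribution and utility for graph patterns are substantially more involved.

```python
import math
import random

random.seed(0)

# Hypothetical patterns with their (sensitive) support counts as utility.
support = {"P1": 40, "P2": 38, "P3": 12, "P4": 5}
patterns = list(support)
eps, sensitivity = 0.5, 1.0            # privacy budget, utility sensitivity

def score(p):
    # Exponential-mechanism target: Pr(p) proportional to exp(eps * u(p) / (2 * du)).
    return math.exp(eps * support[p] / (2 * sensitivity))

state = random.choice(patterns)
draws = []
for _ in range(10000):
    proposal = random.choice(patterns)         # symmetric (uniform) proposal
    if random.random() < min(1.0, score(proposal) / score(state)):
        state = proposal
    draws.append(state)

# High-support patterns should dominate the private output distribution.
print({p: draws.count(p) / len(draws) for p in patterns})
```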

    On the Unicity of Smartphone Applications

    Prior works have shown that the list of apps installed by a user reveals a lot about the user's interests and behavior. These works rely on the semantics of the installed apps and show that various user traits can be learnt automatically using off-the-shelf machine-learning techniques. In this work, we focus on the re-identifiability issue and thoroughly study the unicity of smartphone apps on a dataset containing 54,893 Android users collected over a period of 7 months. Our study finds that any 4 apps installed by a user are enough (more than 95% of the time) to re-identify the user in our dataset. As the complete list of installed apps is unique for 99% of the users in our dataset, it can easily be used to track or profile users by any service, such as Twitter, that has access to the whole list of a user's installed apps. As our analyzed dataset is small compared to the total population of Android users, we also study how unicity would vary with larger datasets. This work emphasizes the need for better privacy guards against the collection, use and release of the list of installed apps.
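    The unicity measurement itself is straightforward to express. The sketch below runs it on synthetic app lists (hypothetical data, not the paper's dataset): for each user, probe whether 4 randomly chosen installed apps match exactly one user.

```python
import random

random.seed(0)

# Synthetic population: each user holds a random set of app IDs.
n_users, n_apps = 1000, 200
users = [frozenset(random.sample(range(n_apps), random.randint(10, 40)))
         for _ in range(n_users)]

def re_identified(user_apps, k=4):
    """True if k random apps from this user match exactly one user overall."""
    probe = set(random.sample(sorted(user_apps), k))
    matches = sum(1 for u in users if probe <= u)
    return matches == 1

trials = [re_identified(u) for u in users if len(u) >= 4]
print("fraction re-identified by 4 apps:", sum(trials) / len(trials))
```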

    Econometrics

    As a unified discipline, econometrics is still relatively young and has been transforming and expanding very rapidly. Major advances have taken place in the analysis of cross-sectional data by means of semiparametric and nonparametric techniques. Heterogeneity of economic relations across individuals, firms and industries is increasingly acknowledged, and attempts have been made to take it into account either by integrating out its effects or by modelling the sources of heterogeneity when suitable panel data exist. The counterfactual considerations that underlie policy analysis and treatment valuation have been given a more satisfactory foundation. New time-series econometric techniques have been developed and employed extensively in the areas of macroeconometrics and finance. Nonlinear econometric techniques are used increasingly in the analysis of cross-section and time-series observations. Applications of Bayesian techniques to econometric problems have been promoted largely by advances in computer power and computational techniques. The use of Bayesian techniques has in turn provided investigators with a unifying framework where the tasks of forecasting, decision making, model evaluation and learning can be considered as parts of the same interactive and iterative process, thus providing a basis for 'real time econometrics'.

    Alternative computational approaches to inference in the multinomial probit model

    This research compares several approaches to inference in the multinomial probit model, based on two Monte Carlo experiments for a seven-choice model. The methods compared are the simulated maximum likelihood (SML) estimator using the GHK recursive probability simulator, the method of simulated moments (MSM) estimator using the GHK recursive simulator and kernel-smoothed frequency simulators, and posterior means computed using a Gibbs sampling-data augmentation algorithm. Overall, the Gibbs sampling algorithm has a slight edge, with the relative performance of MSM and SML based on the GHK simulator being difficult to evaluate. The MSM estimator with the kernel-smoothed frequency simulator is clearly inferior.
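    For reference, the GHK recursive simulator approximates the normal rectangle probabilities that appear as multinomial probit choice probabilities. The minimal sketch below computes P(X < b) for X ~ N(0, Sigma); the full MNP setup adds utility differencing and nonzero means.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

def ghk_probability(b, Sigma, n_draws=5000):
    """GHK recursive simulator for P(X < b) with X ~ N(0, Sigma)."""
    L = np.linalg.cholesky(Sigma)
    m = len(b)
    prob = np.ones(n_draws)
    eta = np.zeros((n_draws, m))
    for j in range(m):
        # Upper truncation point for the j-th standard normal,
        # given the draws of the earlier components.
        upper = (b[j] - eta[:, :j] @ L[j, :j]) / L[j, j]
        p_j = norm.cdf(upper)
        prob *= p_j                    # recursive probability factor
        # Draw eta_j from the standard normal truncated above at `upper`.
        u = rng.uniform(size=n_draws)
        eta[:, j] = norm.ppf(u * p_j)
    return prob.mean()

Sigma = np.array([[1.0, 0.5], [0.5, 1.0]])
# Exact answer is 1/4 + arcsin(0.5)/(2*pi), roughly 0.333.
print(ghk_probability(np.array([0.0, 0.0]), Sigma))
```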

    Sampling Triples from Restricted Networks Using MCMC Strategy

    In large networks, connected triples are useful for solving various tasks including link prediction, community detection, and spam filtering. Existing works in this direction are concerned mostly with the exact or approximate counting of connected triples that are closed (i.e., triangles). The task of triple sampling has not been explored in depth, although sampling is a more fundamental task than counting, and the former is useful for solving various other tasks, including counting. In recent years, some works on triple sampling have been proposed that are based on direct sampling, solely for the purpose of triangle count approximation. They sample only from a uniform distribution and are not effective for sampling triples from an arbitrary user-defined distribution. In this work we present two indirect triple sampling methods that are based on a Markov chain Monte Carlo (MCMC) sampling strategy. Both methods are highly efficient compared to a direct sampling-based method, specifically for the task of sampling from a non-uniform probability distribution. Another significant advantage of the proposed methods is that they can sample triples from networks with restricted access, to which a direct sampling-based method is simply not applicable.
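    A minimal version of such an indirect sampler is a Metropolis-Hastings walk over the space of connected triples, sketched below with an assumed neighborhood structure and an arbitrary illustrative target weight; the paper's two methods define their own proposal and correction schemes. Note that the walk only needs adjacency queries, which fits the restricted-access setting.

```python
import random
import networkx as nx

random.seed(0)
G = nx.karate_club_graph()

def triple_neighbors(t):
    """Connected triples (a, b, c), b the center, reachable from t
    by changing one vertex; endpoints stored in sorted order."""
    a, b, c = t
    out = set()
    for x in G[b]:                       # swap one endpoint of the center
        if x != c and x != a:
            out.add((min(x, c), b, max(x, c)))
            out.add((min(a, x), b, max(a, x)))
    for end in (a, c):                   # re-center on an endpoint
        for x in G[end]:
            if x != b:
                out.add((min(b, x), end, max(b, x)))
    out.discard(t)
    return list(out)

def weight(t):
    # Arbitrary user-defined target: favor triples with high total degree.
    return sum(G.degree(v) for v in t)

state = (1, 0, 2)                        # node 0 is adjacent to 1 and 2 here
samples = []
for _ in range(20000):
    nbrs = triple_neighbors(state)
    prop = random.choice(nbrs)
    # MH correction for the unequal neighborhood sizes of state and proposal.
    ratio = (weight(prop) * len(nbrs)) / (weight(state) * len(triple_neighbors(prop)))
    if random.random() < min(1.0, ratio):
        state = prop
    samples.append(state)
```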

    Small versus big-data factor extraction in Dynamic Factor Models: An empirical assessment

    In the context of Dynamic Factor Models (DFM), we compare point and interval estimates of the underlying unobserved factors extracted using small- and big-data procedures. Our paper differs from previous works in the related literature in several ways. First, we focus on factor extraction rather than on prediction of a given variable in the system. Second, the comparisons are carried out by applying the procedures considered to the same data. Third, we are interested not only in point estimates but also in confidence intervals for the factors. Based on a simulated system and the macroeconomic data set popularized by Stock and Watson (2012), we show that, for a given procedure, factor estimates based on different cross-sectional dimensions are highly correlated. On the other hand, given the cross-sectional dimension, the maximum likelihood Kalman filter and smoother (KFS) factor estimates are highly correlated with those obtained using hybrid principal components (PC) and KFS procedures. The PC estimates are somewhat less correlated. Finally, the PC intervals based on asymptotic approximations are unrealistically tiny. Financial support from the Spanish Government projects ECO2012-32854 and ECO2012-32401 is acknowledged by the first and second authors, respectively.
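    As a point of reference for the PC side of the comparison, the sketch below simulates a one-factor DFM and extracts the factor as the leading principal component. This is an illustration only; the paper's hybrid and KFS procedures are model-based state-space estimators.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a one-factor DFM: x_it = lambda_i * f_t + e_it, f_t an AR(1).
T, N = 200, 50
f = np.zeros(T)
for t in range(1, T):
    f[t] = 0.7 * f[t - 1] + rng.standard_normal()
Lam = rng.standard_normal(N)
X = np.outer(f, Lam) + rng.standard_normal((T, N))

# PC estimate: project the centered data on the leading eigenvector
# of the sample covariance matrix.
Xc = X - X.mean(axis=0)
eigval, eigvec = np.linalg.eigh(Xc.T @ Xc / T)
f_pc = Xc @ eigvec[:, -1]              # leading principal component

# Sign and scale are unidentified; align the sign before comparing.
f_pc *= np.sign(f_pc @ f)
print("corr(true factor, PC estimate):", np.corrcoef(f, f_pc)[0, 1])
```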